Memory efficient alignment between RNA sequences and stochastic grammar models of pseudoknots

Authors

  • Yinglei Song
  • Chunmei Liu
  • Russell L. Malmberg
  • Congzhou He
  • Liming Cai
Abstract

Stochastic Context-Free Grammars (SCFGs) have been shown to be effective in modelling RNA secondary structure for searches. Our previous work (Cai et al., 2003) on Stochastic Parallel Communicating Grammar Systems (SPCGS) extended SCFGs to model RNA pseudoknots. However, the alignment algorithm requires O(n^4) memory for a sequence of length n. In this paper, we develop a memory-efficient algorithm for sequence-structure alignments including pseudoknots. This new algorithm reduces the memory requirement from O(n^4) to O(n^2) without increasing the computation time. Our experiments show that this approach achieves excellent performance in searching for RNA pseudoknots.
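
To give a rough sense of what the reduction from O(n^4) to O(n^2) memory means in practice, the following sketch compares the size of a dynamic programming table under each growth rate. It is only an illustration: the constant factor of 1, the 8 bytes per cell, and the helper names `table_cells` and `table_bytes` are assumptions made here, not details taken from the paper.

```python
def table_cells(n: int, exponent: int) -> int:
    """Cells in a table that grows as n**exponent (constant factor assumed to be 1)."""
    return n ** exponent


def table_bytes(n: int, exponent: int, bytes_per_cell: int = 8) -> int:
    """Approximate table size in bytes, assuming a fixed-size cell (8 bytes by default)."""
    return table_cells(n, exponent) * bytes_per_cell


if __name__ == "__main__":
    # Sequence lengths chosen only for illustration.
    for n in (100, 200, 400):
        quartic = table_bytes(n, 4)    # growth rate of the earlier algorithm
        quadratic = table_bytes(n, 2)  # growth rate of the memory-efficient algorithm
        print(f"n={n}: O(n^4) ~ {quartic / 1e9:.1f} GB, O(n^2) ~ {quadratic / 1e3:.1f} KB")
```

Under these assumptions, doubling the sequence length multiplies the quartic table by 16 but the quadratic table only by 4, which is why the reduction matters most for longer sequences.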

Similar resources

RNA Secondary Structure Prediction with Simple Pseudoknots

Pseudoknots are widely occurring structural motifs in RNA. Pseudoknots have been shown to be functionally important in different RNAs which play regulatory, catalytic, or structural roles in cells. Current biophysical methods to identify the presence of pseudoknots are extremely time consuming and expensive. Therefore, bioinformatics approaches to accurately predict such structures are highly d...

Predicting RNA Secondary Structures with Pseudoknots by MCMC Sampling (preprint)

The most probable secondary structure of an RNA molecule, given the nucleotide sequence, can be computed efficiently if a stochastic context-free grammar (SCFG) is used as the prior distribution of the secondary structure. The structures of some RNA molecules contain so-called pseudoknots. Allowing all possible configurations of pseudoknots is not compatible with context-free grammar models and...

The language of RNA: a formal grammar that includes pseudoknots

MOTIVATION In a previous paper, we presented a polynomial time dynamic programming algorithm for predicting optimal RNA secondary structure including pseudoknots. However, a formal grammatical representation for RNA secondary structure with pseudoknots was still lacking. RESULTS Here we show a one-to-one correspondence between that algorithm and a formal transformational grammar. This grammar...

RNA, SCFGs and Classifiers

We developed, implemented and tested a stochastic context-free grammar (SCFG)-based method to classify RNA molecules into structural classes, by training grammars to simultaneously recognize certain types of RNAs and disrecognize other types. We tested our program using datasets obtained from thermodynamic (RNAfold) predictions of structures for 6000 tRNA and 6000 miRNA sequences, and we believ...

Prediction and statistics of pseudoknots in RNA structures using exactly clustered stochastic simulations

Ab initio RNA secondary structure predictions have long dismissed helices interior to loops, so-called pseudoknots, despite their structural importance. Here, we report that many pseudoknots can be predicted through long time scales RNA folding simulations, which follow the stochastic closing and opening of individual RNA helices. The numerical efficacy of these stochastic simulations relies on...

Journal title:
  • International journal of bioinformatics research and applications

Volume 2, Issue 3

Pages -

Publication date: 2006